Superpixel-based Semantic Segmentation Trained by Statistical Process Control
Semantic segmentation, like other fields of computer vision, has seen a
remarkable performance advance through the use of deep convolutional neural networks.
However, because neighboring pixels are heavily dependent on each
other, both training and testing of these methods involve many redundant
operations. To resolve this problem, the proposed network is trained and tested
with only 0.37% of the total pixels by superpixel-based sampling and largely
reduces the complexity of the upsampling calculation. The hypercolumn feature maps
are constructed by a pyramid module in combination with the convolution layers of
the base network. Since the proposed method uses a very small number of sampled
pixels, the end-to-end learning of the entire network is difficult with a
common learning rate for all the layers. In order to resolve this problem, the
learning rate after sampling is controlled by statistical process control (SPC)
of gradients in each layer. The proposed method performs as well as or better
than conventional methods that use many more samples on the Pascal Context and
SUN-RGBD datasets.
Comment: Accepted in British Machine Vision Conference (BMVC), 201
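The abstract does not spell out how SPC is applied to the gradients, so the following is only a minimal sketch of one plausible reading: a Shewhart-style control chart kept per layer over gradient norms, which lowers that layer's learning rate when the current norm exceeds the upper control limit. The class name `GradientControlChart` and the parameters `k`, `shrink`, and `warmup` are assumptions made for illustration, not details from the paper.

```python
import math


class GradientControlChart:
    """Illustrative control chart over one layer's gradient norms.

    Hypothetical helper: the limit width `k`, the `shrink` factor, and the
    `warmup` length are assumed values, not taken from the paper.
    """

    def __init__(self, base_lr, k=3.0, shrink=0.5, warmup=5):
        self.base_lr = base_lr
        self.k = k            # control-limit width in standard deviations
        self.shrink = shrink  # factor applied to the rate when out of control
        self.warmup = warmup  # observations required before limits are used
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0         # running sum of squared deviations (Welford)

    def update(self, grad_norm):
        """Return the learning rate for this step, then record grad_norm."""
        lr = self.base_lr
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            upper_limit = self.mean + self.k * std
            if grad_norm > upper_limit:
                # Out-of-control signal: take a smaller step for this layer.
                lr = self.base_lr * self.shrink
        # Welford's online update of the running mean and variance.
        self.n += 1
        delta = grad_norm - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (grad_norm - self.mean)
        return lr
```

In a training loop, one such chart would be kept per layer and fed that layer's gradient norm at every step; the returned value would then be used as the layer's learning rate for that step.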
Dynamic Graph Generation Network: Generating Relational Knowledge from Diagrams
In this work, we introduce a new algorithm for analyzing a diagram, which
contains visual and textual information in an abstract and integrated way.
Although diagrams contain richer information than individual
image-based or language-based data, proper solutions for automatically
understanding them have not been proposed because of their inherent
multi-modality and arbitrary layouts. To tackle this problem, we
propose a unified diagram-parsing network for generating knowledge from
diagrams based on an object detector and a recurrent neural network designed
for a graphical structure. Specifically, we propose a dynamic graph-generation
network that is based on dynamic memory and graph theory. We explore the
dynamics of information in a diagram through the activation of gates in gated
recurrent unit (GRU) cells. On publicly available diagram datasets, our model
demonstrates a state-of-the-art result that outperforms other baselines.
Moreover, further experiments on question answering show the potential of the
proposed method for various applications.
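For readers unfamiliar with the gates mentioned above, here is a minimal sketch of one step of a standard GRU cell, written with scalars so the update and reset gate activations can be inspected directly. It illustrates the generic GRU equations only and is not the paper's dynamic graph-generation network; the function name `gru_step` and the parameter dictionary keys are assumptions chosen for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One scalar GRU step; returns (new_h, update_gate, reset_gate).

    `p` is a hypothetical dict of scalar weights and biases.
    """
    # Update gate: how much of the candidate state replaces the old state.
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])
    # Reset gate: how much of the old state feeds into the candidate.
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])
    # Candidate state computed from the input and the reset-scaled old state.
    h_tilde = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    # Interpolate between the old state and the candidate.
    new_h = (1.0 - z) * h + z * h_tilde
    return new_h, z, r
```

Logging `z` and `r` across steps is one simple way to observe how much information a cell keeps or overwrites. Note that some references swap the roles of `z` and `1 - z` in the final interpolation.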